3-Computer Science-Software-Data Hierarchy-File

Comma-Separated Value file

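Text files {Comma-Separated Value file} (CSV) can contain record lists, with record values separated by commas. Values can be in double quotation marks, which are required when a value contains a comma, a quotation mark, or a line break. Each record ends with a newline character.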
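A minimal Python sketch of reading such records, assuming the standard csv module. The sample data and the function name read_records are hypothetical, chosen only to show a quoted value that contains a comma and a doubled quotation mark.

```python
import csv
import io

# Hypothetical sample: the second record has a value containing a comma,
# so that value is quoted; an embedded quotation mark is written twice.
SAMPLE = 'name,city,note\r\n"Smith, Ann",Boston,"said ""hi"""\r\nLee,Austin,\r\n'

def read_records(text):
    """Parse CSV text into a list of records, each a list of string values."""
    return list(csv.reader(io.StringIO(text, newline="")))

records = read_records(SAMPLE)
for record in records:
    print(record)
```

The reader removes the enclosing quotation marks and returns every value as a string, including the empty last value of the third record.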

DOC file

Microsoft(R) Word, WordPerfect (TM), and similar word-processing programs save documents as binary files {DOC file} with DOC extension. Word-processing programs can convert DOC-file formats. In Microsoft Word, files can have Table of Contents (TOC) and Index, as well as links, fields, and hidden text.

contents

Heading paragraph styles determine TOC paragraph styles and indentation levels. TOC page numbers link to headings.

index

Text can have index-entry fields, which you can hide or show. In Microsoft Word, an index-entry field has the format { XE "phrase" }, in which the braces are special field characters, not typed braces. To search for fields, use ^d. To insert field braces, use Ctrl-F9, or use the Insert menu and Field command. Page numbers in Index are not links.

Framemaker file

Adobe FrameMaker (TM) word-processing program uses a proprietary document format {Framemaker file} with extension FM. You can import files from other formats and export FrameMaker files to other formats. Other applications typically cannot open FrameMaker files. Adobe Acrobat (TM) can open FrameMaker files.

HTML file

Hypertext Markup Language (HTML) is an application of Standard Generalized Markup Language (SGML). Such files (SGML file) {HTML file} are text files.

tags

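HTML uses tags to open and close formatting. Tags must nest properly: an element opened inside another element must close before the outer element closes. Tags start with less-than signs and end with greater-than signs. End tags have a slash after the less-than sign.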
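A minimal Python sketch of checking tag nesting, assuming the standard html.parser module. The class NestingChecker, the function check_nesting, and the partial VOID_TAGS list are hypothetical names for illustration. The check is stricter than HTML 4, which lets some end tags, such as </P> and </LI>, be omitted.

```python
from html.parser import HTMLParser

# Elements that take no end tag (a partial, illustrative list).
VOID_TAGS = {"br", "img", "meta", "link", "hr", "input"}

class NestingChecker(HTMLParser):
    """Keep open tags on a stack and record end tags that do not match."""

    def __init__(self):
        super().__init__()
        self.stack = []   # tags opened but not yet closed
        self.errors = []  # end tags that did not match the innermost open tag

    def handle_starttag(self, tag, attrs):
        # The parser reports tag names in lowercase.
        if tag not in VOID_TAGS:
            self.stack.append(tag)

    def handle_endtag(self, tag):
        if self.stack and self.stack[-1] == tag:
            self.stack.pop()
        else:
            self.errors.append(tag)

def check_nesting(html_text):
    """Return (mismatched end tags, tags left open)."""
    checker = NestingChecker()
    checker.feed(html_text)
    checker.close()
    return checker.errors, checker.stack

print(check_nesting("<P><B>bold</B></P>"))
print(check_nesting("<P><B>bold</P></B>"))
```

The first call finds no problems. The second reports the overlapping P end tag and leaves P on the stack as unclosed.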

semantics

HTML tags describe document structure and presentation, not the meaning of the data that they enclose.

comment

<!-- comment --> inserts comment.

format

HTML files have a document-type line first, heading section second, and body section last.

document type line

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/strict.dtd"> starts HTML files.

beginning and ending lines

<HTML LANG="en-US"> is after the document-type line. </HTML> is at file end.

heading block

<HEAD> is after <HTML>, to start heading block.

<META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=UTF-8"> declares content type and character encoding for browsers.

<META HTTP-EQUIV="Expires" CONTENT="0"> marks page as already expired, so that browsers and caches fetch a fresh copy.

<TITLE>Title</TITLE> inserts title.

<META HTTP-EQUIV="keywords" CONTENT="keyword1, ..., keyword25"> or <META NAME="keywords" CONTENT="keyword1, ..., keyword25"> specifies up to 25 keywords for search engines.

<META NAME="description" CONTENT="25-word description"> describes HTML pages for search engines, using up to 25 words.

<META NAME="robots" CONTENT="noindex"> excludes indexing by search engines. <META NAME="robots" CONTENT="nofollow"> excludes searching for links by search engines. <META NAME="robots" CONTENT="noindex, nofollow"> excludes indexing by search engines and searching for links by search engines.
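A minimal Python sketch of how a program might read the robots directives from a page, assuming the standard html.parser module. The class MetaCollector and the function robots_directives are hypothetical names, and this is not how any particular search engine works.

```python
from html.parser import HTMLParser

class MetaCollector(HTMLParser):
    """Collect META tags into a dictionary keyed by NAME or HTTP-EQUIV."""

    def __init__(self):
        super().__init__()
        self.meta = {}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attributes = dict(attrs)  # attribute names arrive in lowercase
        key = attributes.get("name") or attributes.get("http-equiv")
        if key:
            self.meta[key.lower()] = attributes.get("content") or ""

def robots_directives(html_text):
    """Return the set of robots directives, such as {'noindex', 'nofollow'}."""
    collector = MetaCollector()
    collector.feed(html_text)
    content = collector.meta.get("robots", "")
    return {part.strip().lower() for part in content.split(",") if part.strip()}

page = '<HEAD><META NAME="robots" CONTENT="noindex, nofollow"></HEAD>'
print(robots_directives(page))
```

A page with no robots META tag gives an empty set, meaning no restriction was requested.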

<STYLE TYPE="text/css"> starts the list of style rules. Style selectors include: P, H1, H2, H3, H4, H5, H6, A, and LI. P is paragraph. H1 through H6 are headings. A is anchor for name, id, and references. LI is list element. Style declarations, such as "color: blue", are in braces after the selector. text-align: center; font-family: Verdana, Arial, Helvetica, sans-serif; font-size: 12px; font-weight: bold; font-style: italic; color: red; text-decoration: none; are examples.

<LINK HREF="stylesheet.css" REL="stylesheet" TYPE="text/css"> points to an external style sheet. It is a separate element in the heading block, outside the STYLE block.

</STYLE> ends the list.

HTML pages can call programs on the server through the Common Gateway Interface (CGI). <IMG ALT="Counter" SRC="/cgi-bin/counter.pl"> specifies Counter image. <A HREF="http://.../cgi-bin/guestbook.cgi">View My Guestbook</A> uses guestbook. By contrast, <SCRIPT LANGUAGE="JavaScript" TYPE="text/javascript" SRC="path/file.js"></SCRIPT> loads a JavaScript file, which runs in the browser, not through CGI.

</HEAD> ends heading block.

body beginning and ending

After heading block, <BODY> starts body block. <BODY STYLE="background-image: url(path/file_name.gif); background-repeat: no-repeat;"> uses the STYLE attribute to specify parameters. At line before </HTML>, </BODY> ends BODY block.

paragraph, heading, and section

<P>...</P> is for paragraph, which has blank lines before and after.

<BR> is for line break.

<H1>...</H1> is for heading 1. Headings 2 through 6 are similar.

<DIV>...</DIV> is for text section or page division, with blank lines before and after.

<SPAN>...</SPAN> is for paragraph phrase.

<OL> <LI>...</LI> </OL> is for ordered (numbered) list. <UL> <LI>...</LI> </UL> is for unordered (bulleted) list.

style tag and attribute

<P STYLE="text-align: center; font-family: Verdana, Arial, Helvetica, sans-serif; font-weight: bold; font-size: 12px; font-style: italic; color: red; text-decoration: none;"> specifies parameters.

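For font-family, use fonts found on most systems, such as Courier, Times, Arial, or Helvetica, and end the list with a generic family as fallback. font-family:sans-serif covers fonts such as Verdana, Arial, and Helvetica. font-family:serif covers fonts such as Times. font-family:monospace covers fonts such as Courier.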

For font-size, use a length, such as font-size:12px, or use font-size:larger, font-size:smaller, or font-size:medium.

For font-weight, use font-weight:bold or font-weight:normal.

For font-style, use font-style:italic or font-style:normal.

For color, use hexadecimal string with leading number sign, such as color:#FFFFFF for white and color:#000000 for black, or use color name, such as color:blue, color:red, or color:green.
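A minimal Python sketch of how a hexadecimal color string maps to red, green, and blue intensities from 0 to 255. The function name hex_to_rgb is hypothetical. Full intensity in all three channels (#FFFFFF) is white, and zero intensity (#000000) is black.

```python
def hex_to_rgb(color):
    """Convert a CSS hexadecimal color such as '#FF0000' to (red, green, blue)."""
    digits = color.lstrip("#")
    if len(digits) == 3:
        # Shorthand form: each digit is doubled, so '#F00' means '#FF0000'.
        digits = "".join(ch * 2 for ch in digits)
    if len(digits) != 6:
        raise ValueError(f"expected 3 or 6 hexadecimal digits: {color!r}")
    # Each pair of hexadecimal digits is one channel.
    return tuple(int(digits[i:i + 2], 16) for i in (0, 2, 4))

print(hex_to_rgb("#FFFFFF"))
print(hex_to_rgb("#000000"))
print(hex_to_rgb("#F00"))
```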

For text-decoration, use text-decoration:none to remove underlines from links.

table

<TABLE BORDER="0" WIDTH="640" CELLPADDING="0" CELLSPACING="0" > starts table, with border width, table width, cell padding, and cell spacing. WIDTH="100%" uses percent. <TR>...</TR> is for table row. <TD COLSPAN="3" WIDTH="100%" ALIGN="left" VALIGN="middle">...</TD> is for table data cell, such as column entry or line, with column span, width, horizontal alignment, and vertical alignment. </TABLE> ends table.

image

<IMG SRC="/x/x/xxx.gif" WIDTH="15" HEIGHT="10" HSPACE="0" VSPACE="0"> is for image, with source file, width, height, horizontal spacing, and vertical spacing.

jumping

<A HREF="#Glossary">Glossary</A> jumps to same-file position marked by <A NAME="Glossary"></A>. If no position has that name, the browser does not jump.

anchor and linking

<A HREF="path/file_name">link_name</A> is for linking to a file or URI by clicking on the link name. <A HREF="path/file_name#ID">link_name</A> is for linking to a file or URI and jumping to the ID in the file. <A HREF="path/file_name.jpg" TARGET="_blank">image</A> opens the file in a new window. <A HREF="http://x/x/xxx.pdf" NAME="x" ONMOUSEOVER="ft(this)"></A> runs the script function ft when the cursor moves over the link.

URI Escape Characters

Use % followed by two hexadecimal digits to encode a character's byte value. Always encode the following URI characters. space = %20. double quotation mark = %22. # = %23. % = %25. < = %3C. > = %3E. [ = %5B. backslash = %5C. ] = %5D. ^ = %5E. ` = %60 = opening single quote. { = %7B. | = %7C. } = %7D. Older specifications also encoded ~ = %7E, but current URI syntax treats tilde as unreserved. Letters, digits, period, hyphen, and underscore are OK. Apostrophe, parentheses, exclamation mark, and asterisk are reserved as sub-delimiters, so avoid them or encode them.

URI reserved characters define syntax. If used as data, rather than as delimiters, encode them. $ = %24. & = %26. + = %2B. , = %2C. / = %2F. : = %3A. ; = %3B. = = %3D. ? = %3F. @ = %40.
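A minimal Python sketch of percent-encoding, assuming the standard urllib.parse module. The function names encode_path_segment and decode_path_segment are hypothetical. Passing safe="" makes quote encode reserved characters too, which suits text used as data inside one path segment.

```python
from urllib.parse import quote, unquote

def encode_path_segment(text):
    """Percent-encode text for use as one URI path segment.

    safe="" means reserved characters such as '/' and '?' are encoded too,
    because here they are data, not URI syntax.
    """
    return quote(text, safe="")

def decode_path_segment(text):
    """Reverse the encoding, turning %XX sequences back into characters."""
    return unquote(text)

print(encode_path_segment("my file#1.html"))
print(encode_path_segment("a/b?c=d"))
print(decode_path_segment("my%20file%231.html"))
```

Characters outside ASCII are first converted to UTF-8 bytes, and each byte is then written as %XX.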

characters

Unicode (and ISO-Latin) characters 0 to 31 and 127 are ASCII control characters and do not print. tab = 9. carriage return = 13.

HTML recognizes the following Unicode, ASCII, and ANSI characters: blank ! double quotation mark # $ % ' ( ) * + , - . / 0 1 2 3 4 5 6 7 8 9 : ; = ? @ A B C D E F G H I J K L M N O P Q R S T U V W X Y Z [ backslash ] ^ _ ` a b c d e f g h i j k l m n o p q r s t u v w x y z left brace | right brace ~ ¢ £ ± µ.

HTML Escape Characters

For other characters or for escaping characters, use &#NNN; for Unicode (and ISO-Latin) decimal numbers or &...; for alphabetic. 000 to 031 have no printing characters. 009 = tab. 013 = carriage return.

032 = sp = blank = space. 033 = excl = !. 034 = quot = double quotation mark. 035 = num = #. 036 = dollar = $. 037 = percnt = %. 038 = amp = & (ampersand). 039 = apos = ' = apostrophe = closing single quote. 040 = lpar = (. 041 = rpar = ). 042 = ast = *. 043 = plus = +. 044 = comma = ,. 045 = hyphen = dash = -. 046 = period = . 047 = sol = / (solidus). 048 to 057 are digits 0 to 9. 058 = colon = :. 059 = semi = ;. 060 = lt = <. 061 = equals = =. 062 = gt = >. 063 = quest = ?. 064 = commat = @.

065 to 090 are letters A to Z.

091 = lsqb = [. 092 = bsol = backslash. 093 = rsqb = ]. 094 = caret = ^. 095 = lowbar = _. 096 = ` (accent grave).

097 to 122 are letters a to z.

123 = lcub = left brace = left curly bracket. 124 = verbar = |. 125 = rcub = right brace = right curly bracket. 126 = tilde = sim = ~. 127 is for Delete.
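A minimal Python sketch of escaping and unescaping, assuming the standard html module. The function name to_numeric_reference is hypothetical. It writes the three-digit decimal form used in the lists above.

```python
import html

def to_numeric_reference(ch):
    """Return the decimal character reference for one character, e.g. '&#060;'."""
    return f"&#{ord(ch):03d};"

# html.escape replaces the characters that HTML would read as markup.
escaped = html.escape('<A HREF="x">Q&A</A>')
print(escaped)

# html.unescape accepts both numeric and named references.
print(html.unescape("&#060; &lt; &#038; &amp;"))
print(to_numeric_reference("<"))
```

Numeric and named references for the same character decode to the same result, so &#060; and &lt; both give the less-than sign.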

ISO-Latin characters 128 to 255 are not in the ASCII set and may not print.

Unicode assigns 128 to 159 to control characters, not to the printing characters that ANSI places there.

Unicode and ANSI control characters

000 = NUL (Null character). 001 = SOH (Start of Header). 002 = STX (Start of Text). 003 = ETX (End of Text). 004 = EOT (End of Transmission). 005 = ENQ (Enquiry). 006 = ACK (Acknowledgment). 007 = BEL (Bell). 008 = BS (Backspace). 009 = HT (Horizontal Tab). 010 = LF (Line Feed). 011 = VT (Vertical Tab). 012 = FF (Form Feed). 013 = CR (Carriage Return). 014 = SO (Shift Out). 015 = SI (Shift In). 016 = DLE (Data Link Escape). 017 = DC1 (Device Control 1 or XON). 018 = DC2 (Device Control 2). 019 = DC3 (Device Control 3 or XOFF). 020 = DC4 (Device Control 4). 021 = NAK (Negative Acknowledgement). 022 = SYN (Synchronous Idle). 023 = ETB (End of Transmission Block). 024 = CAN (Cancel). 025 = EM (End of Medium). 026 = SUB (Substitute). 027 = ESC (Escape). 028 = FS (File Separator). 029 = GS (Group Separator). 030 = RS (Record Separator). 031 = US (Unit Separator).
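A minimal Python sketch that lists which codes from 0 to 255 Unicode classifies as control characters, assuming the standard unicodedata module and its general category Cc. The function name is_control is hypothetical.

```python
import unicodedata

def is_control(code_point):
    """Return True if Unicode classifies the code point as a control character."""
    return unicodedata.category(chr(code_point)) == "Cc"

control_codes = [n for n in range(256) if is_control(n)]

print(control_codes[:32])   # the ASCII control characters 0 to 31
print(control_codes[32:])   # 127 (Delete) and the range 128 to 159
```

The result covers 0 to 31, 127, and 128 to 159, which matches the ASCII control characters plus the upper control range.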

PDF file

Portable Document Format {PDF file} derives from a page-description language {PostScript} (PS file) from Adobe Systems. PDF files can be unstructured, structured, or tagged, with increasing file-structure information, such as bookmarks, pages, tables, lists, and images. The freely downloadable Adobe Reader application reads PDF files. Adobe Acrobat and Acrobat Distiller make PDF files.

SGML file

Standard Generalized Markup Language {SGML file} defines electronic-document structure and content. Structured documents allow searching and semantics.

XML file

Extensible Markup Language {XML file} is a simplified subset of Standard Generalized Markup Language (SGML).

tags

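XML uses text blocks begun and ended by tags. Tags can have attributes, which have quoted values. Well-formed documents conform to XML syntax rules: one root element, properly nested elements, an end tag for every start tag, and quoted attribute values.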
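A minimal Python sketch of testing well-formedness, assuming the standard xml.etree.ElementTree module. The function name is_well_formed and the sample strings are hypothetical. The parser checks syntax only and does not check a document against a DTD.

```python
import xml.etree.ElementTree as ET

def is_well_formed(xml_text):
    """Return True if the text parses as XML, False if the parser rejects it."""
    try:
        ET.fromstring(xml_text)
    except ET.ParseError:
        return False
    return True

print(is_well_formed('<note id="1"><to>Ann</to></note>'))  # properly nested
print(is_well_formed("<note><to>Ann</note></to>"))          # overlapping tags
print(is_well_formed("<note id=1></note>"))                 # unquoted attribute value
```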

document type

Document Type Definition (DTD) defines document structure using tag tree. Valid documents conform to a DTD.

types

XML Linking Language (XLL), later divided into XLink and XPointer, has link semantics, extended links to other documents, pointers to document parts, and links in both directions. Extensible Stylesheet Language (XSL) tells how to transform and display XML documents. Simple API for XML (SAX) allows programming, using tag-sequence events. Document Object Model (DOM) allows programming, using API-object tag trees.
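A minimal Python sketch contrasting the two programming styles, assuming the standard xml.sax and xml.dom.minidom modules. The sample document, the class TagRecorder, and the functions sax_events and dom_item_texts are hypothetical names for illustration.

```python
import xml.sax
from xml.dom import minidom

SAMPLE = "<list><item>a</item><item>b</item></list>"

class TagRecorder(xml.sax.ContentHandler):
    """SAX style: the parser calls these methods as it meets each tag."""

    def __init__(self):
        super().__init__()
        self.events = []

    def startElement(self, name, attrs):
        self.events.append(("start", name))

    def endElement(self, name):
        self.events.append(("end", name))

def sax_events(xml_text):
    """Return the sequence of start and end tag events."""
    handler = TagRecorder()
    xml.sax.parseString(xml_text.encode("utf-8"), handler)
    return handler.events

def dom_item_texts(xml_text):
    """DOM style: build the whole tree first, then query it."""
    document = minidom.parseString(xml_text)
    return [node.firstChild.data for node in document.getElementsByTagName("item")]

print(sax_events(SAMPLE))
print(dom_item_texts(SAMPLE))
```

SAX keeps no tree in memory, so it suits large documents read once. DOM keeps the whole tree, so it suits documents that a program must search or change.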

languages

XML languages include Chemical Markup Language (CML), MathML, Bioinformatic Sequence Markup Language (BSML), Biopolymer Markup Language (BioML), GAME for Drosophila sequence information, and BlastXML for search output.

data

Modeling, storing, and querying data needs data structure. Data often grows exponentially, has complex relationships, has new data types generated from old data, has new relation types, needs archiving, has many objects with much data, has updates, and has users with different needs, skills, and tools. XML schema adds object-oriented relational database concepts. XML can be for Web interfacing among object-oriented relational databases. Object-oriented approaches can model such data. Object-oriented relational database management systems (OODBMS) have query languages, many relation types, constraints, symmetry, data clustering, many data types, tables, triggers, indexing, inheritance, security, access, methods, and objects. Constraints on values can be not-null, unique, test, and one-of.

Date Modified: 2022.0225